A hierarchical duration model for speech recognition based on the ANGIE framework

Authors

  • Grace Chung
  • Stephanie Seneff
Abstract

This paper presents a hierarchical duration model applied to enhance speech recognition. The model is based on the novel ANGIE framework, which is a flexible unified sublexical representation designed for speech applications. This duration model captures duration phenomena operating at the phonological, phonemic, syllabic and morphological levels. At the core of the modelling scheme is a hierarchical normalization procedure performed on the ANGIE parse structure. From this, we derive a robust measure for the rate of speech. The model uses two sets of statistical models: a first set based on relative duration between sublexical units and a second set based on absolute duration that has been normalized with respect to the speaking rate. We have used this paradigm to explore some speech timing phenomena such as the secondary effects on relative duration due to variations in speaking rate, the characteristics of anomalously slow words, and prepausal lengthening effects. Finally, we successfully demonstrate the utility of durational information for recognition applications. In phonetic recognition, we achieve a relative improvement of up to 7.7% by incorporating our model over and above a standard phone duration model, and similarly, in a word spotting task, an improvement from 89.3 to 91.6 (FOM) has resulted. © 1999 Elsevier Science B.V. All rights reserved.
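The two kinds of duration measurement named in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the tree type `Node`, the helper names, the toy durations, and the definition of speaking rate as observed word duration divided by an expected duration are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical sublexical parse node (word, syllable, phone, ...)."""
    label: str
    duration: float = 0.0  # seconds; only meaningful on leaves
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        # A leaf carries its own duration; an inner node sums its children.
        if not self.children:
            return self.duration
        return sum(child.total() for child in self.children)


def leaves(node: Node) -> List[Node]:
    """Collect the leaf nodes in left-to-right order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def relative_durations(node: Node) -> Dict[Tuple[str, str], float]:
    """Share of each parent's duration taken by each child (labels assumed unique)."""
    shares: Dict[Tuple[str, str], float] = {}
    parent_total = node.total()
    for child in node.children:
        shares[(node.label, child.label)] = child.total() / parent_total
        shares.update(relative_durations(child))
    return shares


def speaking_rate(word: Node, expected: float) -> float:
    """Assumed rate measure: values above 1 mean slower than expected."""
    return word.total() / expected


def rate_normalized(word: Node, expected: float) -> Dict[str, float]:
    """Absolute leaf durations divided by the assumed speaking-rate measure."""
    rate = speaking_rate(word, expected)
    return {leaf.label: leaf.duration / rate for leaf in leaves(word)}


# Toy word: two syllables, three phones, made-up durations in seconds.
word = Node("word", children=[
    Node("syl1", children=[Node("p1", 0.06), Node("p2", 0.14)]),
    Node("syl2", children=[Node("p3", 0.20)]),
])
```

In this toy setup, `relative_durations(word)` gives each unit's share of its parent, which does not depend on how fast the word was spoken overall, while `rate_normalized(word, expected)` rescales the phone durations so that they sum to the expected word duration.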


Related articles

Hierarchical duration modelling for speech recognition using the ANGIE framework

We describe a novel hierarchical duration model for speech recognition. The modelling scheme is based on the ANGIE framework, a flexible unified sublexical representation for speech applications. Our duration model captures contextual factors that influence duration of sublexical units at multiple linguistic levels simultaneously, using both relative and absolute duration information. The modelling...


Subword lexical modelling for speech recognition

In this work, we introduce and develop a novel framework, ANGIE, for modelling subword lexical phenomena in speech recognition. Our framework provides a flexible and powerful mechanism for capturing morphology, syllabification, phonology, and other subword effects in a hierarchical manner which maximizes sharing of subword structures. ANGIE models the subword structure within a context-free grammar...


Providing Sublexical Constraints for Word Spotting within the Angie Framework

We describe our recent work in implementing a word-spotting system based on the ANGIE framework and the effects of varying the nature of the sublexical constraints placed upon the wordspotter’s filler model. ANGIE is a framework for modelling speech where the morphological and phonological substructures of words are jointly characterized by a context-free grammar and are represented in a multi-...


Angie: a New Framework for Speech Analysis Based on Morpho-phonological Modelling

This paper describes a new system for speech analysis, ANGIE, which characterizes word substructure in terms of a trainable grammar. ANGIE captures morpho-phonemic and phonological phenomena through a hierarchical framework. The terminal categories can be alternately letters or phone units, yielding a reversible letter-to-sound/sound-to-letter system. In conjunction with a segment network and aco...




Journal:
  • Speech Communication

Volume 27

Publication date: 1999